22 research outputs found

    Data-driven modeling of collaboration networks: A cross-domain analysis

    We analyze large-scale data sets about collaborations from two different domains: economics, specifically 22,000 R&D alliances between 14,500 firms, and science, specifically 300,000 co-authorship relations between 95,000 scientists. Considering the different domains of the data sets, we address two questions: (a) to what extent do the collaboration networks reconstructed from the data share common structural features, and (b) can their structure be reproduced by the same agent-based model? In our data-driven modeling approach, we use aggregated network data to calibrate the probabilities at which agents establish collaborations with either newcomers or established agents. The model is then validated by its ability to reproduce network features not used for calibration, including the distributions of degrees, path lengths, local clustering coefficients, and sizes of disconnected components. Emphasis is put on comparing domains, but also sub-domains (economic sectors, scientific specializations). Interpreting the link probabilities as strategies for link formation, we find that in R&D collaborations newcomers prefer links with established agents, while in co-authorship relations newcomers prefer links with other newcomers. Our results shed new light on the long-standing question about the role of endogenous and exogenous factors (i.e., different information available to the initiator of a collaboration) in network formation. Comment: 25 pages, 13 figures, 4 tables
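
    The link-formation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name `grow_network`, the four-way probability table over (initiator, partner) types, and the rule that newcomers become established after their first collaboration are all assumptions made for illustration.

```python
import random


def grow_network(n_events, probs, seed=0):
    """Grow a collaboration network one link at a time (illustrative sketch).

    probs maps (initiator_type, partner_type) pairs, each "new" or "old",
    to probabilities that sum to 1. This parameterization is an assumption,
    not the calibrated model from the paper.
    """
    rng = random.Random(seed)
    pairs = list(probs)
    weights = [probs[p] for p in pairs]
    nodes = []   # agents that already took part in a collaboration
    edges = []   # one (initiator, partner) tuple per collaboration event
    next_id = 0
    for _ in range(n_events):
        initiator_type, partner_type = rng.choices(pairs, weights=weights)[0]
        chosen = []
        for kind in (initiator_type, partner_type):
            candidates = [n for n in nodes if n not in chosen]
            if kind == "old" and candidates:
                # pick an established agent uniformly at random
                chosen.append(rng.choice(candidates))
            else:
                # create a newcomer (also the fallback if nobody is established)
                chosen.append(next_id)
                next_id += 1
        for n in chosen:
            if n not in nodes:
                nodes.append(n)  # newcomers become established agents
        edges.append(tuple(chosen))
    return nodes, edges
```

    Calibration would then amount to choosing the entries of `probs` so that the simulated network matches aggregate statistics of the empirical one, with the remaining network features held back for validation.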

    Quantifying knowledge exchange in R&D networks: A data-driven model

    We propose a model that reflects two important processes in the R&D activities of firms: the formation of R&D alliances and the exchange of knowledge as a result of these collaborations. In a data-driven approach, we analyze two large-scale data sets, extracting unique information about 7500 R&D alliances and 5200 patent portfolios of firms. This data is used to calibrate the model parameters for network formation and knowledge exchange. We obtain probabilities for incumbent and newcomer firms to link to other incumbents or newcomers that are able to reproduce the topology of the empirical R&D network. The position of firms in a knowledge space is obtained from their patents using two different classification schemes, IPC in 8 dimensions and ISI-OST-INPI in 35 dimensions. Our dynamics of knowledge exchange assumes that collaborating firms approach each other in knowledge space at a rate $\mu$ for an alliance duration $\tau$. Both parameters are obtained in two different ways: by comparing knowledge distances from simulations and empirics, and by analyzing the collaboration efficiency $\hat{\mathcal{C}}_n$. This is a new measure that also takes into account the effort of firms to maintain concurrent alliances, and it is evaluated via extensive computer simulations. We find that R&D alliances have a duration of around two years and that the subsequent knowledge exchange occurs at a very low rate. Hence, a firm's position in the knowledge space is a determinant rather than a consequence of its R&D alliances. From our data-driven approach we also find model configurations that can be both realistic and optimized with respect to the collaboration efficiency $\hat{\mathcal{C}}_n$. Effective policies, as suggested by our model, would incentivize shorter R&D alliances and higher knowledge exchange rates. Comment: 35 pages, 10 figures
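
    To make the roles of the rate μ and the duration τ concrete, here is a minimal sketch of two firms approaching each other in knowledge space. The linear, discrete-time update rule and the names `exchange_knowledge` and `distance` are assumptions for illustration; the abstract does not state the exact form of the dynamics.

```python
import math


def distance(x, y):
    """Euclidean distance between two positions in knowledge space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def exchange_knowledge(x_i, x_j, mu, tau):
    """Move two firms toward each other for tau time steps at rate mu.

    Assumed update rule (not taken from the paper): in every step each
    firm moves a fraction mu of the way toward its partner.
    """
    for _ in range(tau):
        new_i = [a + mu * (b - a) for a, b in zip(x_i, x_j)]
        new_j = [b + mu * (a - b) for a, b in zip(x_i, x_j)]
        x_i, x_j = new_i, new_j
    return x_i, x_j
```

    Under this assumed rule the distance between the partners shrinks by a factor (1 - 2μ) per step, so after τ steps it is (1 - 2μ)^τ times the initial distance. A very low μ therefore leaves the firms close to where they started, which is consistent with the abstract's conclusion that position is a determinant rather than a consequence of alliances.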

    Quantifying and suppressing ranking bias in a large citation network

    It is widely recognized that citation counts for papers from different fields cannot be directly compared because different scientific fields adopt different citation practices. Citation counts are also strongly biased by paper age, since older papers have had more time to attract citations. Various procedures aim at suppressing these biases and give rise to new normalized indicators, such as the relative citation count. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well-known indicators, including the relative citation count and Google's PageRank score, are significantly biased by paper field and age. Our statistical framework to assess ranking bias allows us to exactly quantify the contributions of each individual field to the overall bias of a given ranking. We propose a general normalization procedure motivated by the z-score, which produces much less biased rankings when applied to citation count and PageRank score.
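
    A z-score-style normalization of this kind can be sketched as follows. This is only an illustration of the general idea: grouping papers by exact (field, year) pairs, the record layout, and the name `zscore_by_group` are assumptions, and the procedure in the paper may define the comparison groups differently.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_group(papers):
    """Rescale citation counts within each (field, year) group.

    papers is a list of dicts with the hypothetical keys
    "id", "field", "year", and "citations".
    """
    groups = defaultdict(list)
    for p in papers:
        groups[(p["field"], p["year"])].append(p["citations"])

    scores = {}
    for p in papers:
        values = groups[(p["field"], p["year"])]
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation of the group
        # a group with no spread carries no ranking information
        scores[p["id"]] = (p["citations"] - mu) / sigma if sigma > 0 else 0.0
    return scores
```

    With this rescaling, a paper is compared only with papers of the same field and age, so a highly cited paper in a low-citation field can score as high as one in a high-citation field. The same function could be applied to PageRank scores by replacing the "citations" values.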

    Reconstructing signed relations from interaction data

    Positive and negative relations play an essential role in human behavior and shape the communities we live in. Despite their importance, data about signed relations is rare and commonly gathered through surveys. Interaction data is more abundant, for instance, in the form of proximity or communication data. So far, though, it could not be utilized to detect signed relations. In this paper, we show how the underlying signed relations can be extracted from such data. Employing a statistical network approach, we construct networks of signed relations in four communities. We then show that these relations correspond to the ones reported in surveys. Additionally, the inferred relations allow us to study the homophily of individuals with respect to gender, religious beliefs, and financial backgrounds. We evaluate the importance of triads in the signed network to study group cohesion. Comment: 14 pages, 3 figures, submitted

    Adapting to Disruptions: Flexibility as a Pillar of Supply Chain Resilience

    Supply chain disruptions cause shortages of raw materials and products. To increase resilience, i.e., the ability to cope with shocks, substituting goods in established supply chains can become an effective alternative to creating new distribution links. We demonstrate its impact on supply deficits through a detailed analysis of the US opioid distribution system. Reconstructing 40 billion empirical distribution paths, our data-driven model allows a unique inspection of policies that increase the substitution flexibility. Our approach enables policymakers to quantify the trade-off between increasing flexibility, i.e., reduced supply deficits, and increasing complexity of the supply chain, which could make it more expensive to operate.

    Modeling social resilience: Questions, answers, open problems

    Resilience denotes the capacity of a system to withstand shocks and its ability to recover from them. We develop a framework to quantify the resilience of highly volatile, non-equilibrium social organizations, such as collectives or collaborating teams. It consists of four steps: (i) delimitation, i.e., narrowing down the target systems, (ii) conceptualization, i.e., identifying how to approach social organizations, (iii) formal representation using a combination of agent-based and network models, (iv) operationalization, i.e., specifying measures and demonstrating how they enter the calculation of resilience. Our framework quantifies two dimensions of resilience, the robustness of social organizations and their adaptivity, and combines them in a novel resilience measure. It allows monitoring resilience instantaneously using longitudinal data instead of an ex-post evaluation.

    The structure, exchange, and transfer of knowledge in socio-technical systems

    This thesis aims to improve our understanding of the role of knowledge in economics and science. We analyze collaboration activities in these two domains, and show how the interactions among firms and among scientists influence the structure and the exchange of knowledge. We also model how the knowledge of these actors defines their collaborations. We show that knowledge is not only a consequence, but also a determinant of collaborations. To capture this interplay, we combine a statistical analysis of patent and publication data with agent-based models of collaboration activities. We follow a data-driven approach to study the structure, exchange, and transfer of knowledge. Specifically, using publication data we proxy the structure of scientific knowledge by reconstructing the citation network between publications. On this network, we quantitatively show that citation patterns strongly differ across time and scientific fields. We also identify the different knowledge of scientists, and quantify their knowledge exchange occurring during collaborations. Similarly, we use patent data to identify firms' knowledge and the knowledge exchange between firms involved in R&D alliances. Then, to study the transfer of knowledge, we reconstruct scientists' career paths by tracing the affiliations reported on their publications. With these paths, we construct the global migration network of scientists at city level, and analyze its topological properties. After analyzing collaboration activities, the exchange, and the transfer of knowledge, we reproduce these using agent-based models that we calibrate and validate against real-world data. In order to capture the very different processes behind these phenomena, we develop three different models. Specifically, to model collaboration activities among firms and their subsequent knowledge exchange, we combine and extend two existing models that each captured only one of these phenomena. Our new model, instead, is able to simultaneously reproduce both phenomena. To show how the knowledge differences between scientists determine their collaboration activities, we develop a second model that takes as input only these differences. Then, to model the transfer of knowledge, we develop a third agent-based model that reproduces scientists' migration at city level and the observed topological properties of the global migration network. Finally, we show that citation patterns between journals and scientists' career paths are better modeled by a new mathematical framework defined by higher-order networks than by traditional network models. By this, we challenge the application of the traditional network perspective to model the flow of knowledge between journals and the transfer of knowledge across research institutes.

    When standard network measures fail to rank journals: A theoretical and empirical analysis

    Journal rankings are widely used and are often based on citation data in combination with a network approach. We argue that some of these network-based rankings can produce misleading results. From a theoretical point of view, we show that the standard network modeling approach of citation data at the journal level (i.e., the projection of paper citations onto journals) introduces fictitious relations among journals. To overcome this problem, we propose a citation path approach, and empirically show that rankings based on the network and the citation path approach are very different. Specifically, we use MEDLINE, the largest open-access bibliometric data set, listing 24,135 journals, 26,759,399 papers, and 323,356,788 citations. We focus on PageRank, an established and well-known network metric. Based on our theoretical and empirical analysis, we highlight the limitations of standard network metrics and propose a method to overcome them. ISSN:2641-333

    Reproducing scientists’ mobility: a data-driven model

    High-skill labour is an important factor underpinning the competitive advantage of modern economies. Therefore, attracting and retaining scientists has become a major concern for migration policy. In this work, we study the migration of scientists on a global scale by combining two large data sets covering the publications of 3.5 million scientists over 60 years. We analyse the geographical distances they moved for a new affiliation and their age when moving, in this way reconstructing their geographical “career paths”. These paths are used to derive the world network of scientists’ mobility between cities and to analyse its topological properties. We further develop and calibrate an agent-based model such that it reproduces the empirical findings both at the level of scientists and of the global network. Our model takes into account that the academic hiring process is largely demand-driven and demonstrates that the probability of scientists relocating decreases both with age and with distance. Our results allow interpreting the model assumptions as micro-based decision rules that can explain the observed mobility patterns of scientists. ISSN:2045-232